Next, we turn our attention to using Twitter data to study protest events, and focus on the 2017 Women’s March protests. Protests are notoriously hard to survey, and Twitter can potentially provide us with valuable insights into who is participating in a demonstration.
Below is a map of all geolocated tweets that were sent on January 21, 2017, the day of the Women’s March protest, showing that users across the world tweeted about the event.
We can manipulate these data into a SpatialPointsDataFrame, making sure the CRS is correctly defined, which allows us to plot the points easily using base R plotting functions. CRS stands for “Coordinate Reference System,” which controls the “projection” of the map we wish to visualize, i.e., how it looks. For more information on map projections, see this guide.
Manipulating the data in this way will be helpful when clipping to the boundaries of shapefiles as we go on to describe below.
# load the packages used below, if not already loaded
library(dplyr)
library(sp)

# let's begin by loading the data
wm_geo <- readRDS("data/geo_raw.rds") %>%
  mutate_all(as.numeric)

# generate a SpatialPointsDataFrame (column order must be long/lat)
xy <- wm_geo[, c(1, 2)]
points <- SpatialPointsDataFrame(coords = xy, data = xy,
                                 proj4string = CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"))
plot(points)

These data have been stripped of user-identifying information, including user name and bio. Instead, we have just two columns: latitude and longitude. The points come from all tweets that contained the hashtag #WomensMarch. When we plot the raw latitude and longitude of the points, we can make out the vague outline of countries.
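Because the column order matters when building the spatial object, a quick range check can catch some swapped-column mistakes before plotting. The sketch below is illustrative only: `check_lonlat()` is a hypothetical helper, not part of sp, and the three coordinates are made-up examples rather than rows from the dataset. It cannot catch every swap, since a value such as 38.9 is valid as either a longitude or a latitude.

```r
# Hypothetical helper: TRUE for rows whose values fall inside the valid
# longitude [-180, 180] and latitude [-90, 90] ranges.
check_lonlat <- function(lon, lat) {
  !is.na(lon) & !is.na(lat) &
    lon >= -180 & lon <= 180 &
    lat >= -90 & lat <= 90
}

# Illustrative coordinates (not taken from the dataset)
toy <- data.frame(
  lon = c(-77.04, -0.13, 151.21),
  lat = c(38.90, 51.51, -33.87)
)

ok_right_order <- check_lonlat(toy$lon, toy$lat)
ok_swapped     <- check_lonlat(toy$lat, toy$lon)  # columns passed in the wrong order

all(ok_right_order)  # every row passes when the order is long/lat
ok_swapped           # the third row fails: 151.21 is not a valid latitude
```

If any row fails a check like this, it is worth inspecting the column order before constructing the SpatialPointsDataFrame.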
In order to identify points within the routes of our targeted protest marches, we first read in three shapefiles. Each was created by drawing a buffer of increasing size (50 m, 100 m, and 1,000 m) around the route of the 2017 Washington, D.C. Women’s March.
route50 <- readOGR("shapes/wm_route_example50.shp", verbose = FALSE)
route100 <- readOGR("shapes/wm_route_example100.shp", verbose = FALSE)
route1000 <- readOGR("shapes/wm_route_example1000.shp", verbose = FALSE)
# we also load an image of the women's march protest
routeimg <- image_read("images/wm_march_map.jpg")

We can compare these to the original march route below:
par(mfrow=c(2,2))
plot(routeimg)
plot(route50, main="50m buffer")
plot(route100, main="100m buffer")
plot(route1000, main="1000m buffer")

We can create these shapefiles with relative ease in open-source GIS software such as QGIS.
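For readers who prefer to stay in R, buffers of this kind can also be sketched with the sf package. This is an alternative to the QGIS workflow described here, not the method used to produce the shapefiles above. The route vertices are made up for illustration, and the choice of EPSG:32618 (UTM zone 18N) as the metric CRS is an assumption suited to Washington, D.C.

```r
library(sf)

# Hypothetical route: two made-up vertices in downtown Washington, D.C.
# (long/lat order); a real route would have many more vertices.
route_xy <- matrix(c(-77.0160, 38.8870,
                     -77.0365, 38.8895),
                   ncol = 2, byrow = TRUE)
route <- st_sfc(st_linestring(route_xy), crs = 4326)

# Project to a metric CRS so that buffer distances are in metres
# (EPSG:32618, UTM zone 18N, covers Washington, D.C.)
route_m <- st_transform(route, 32618)

buf50  <- st_buffer(route_m, dist = 50)
buf100 <- st_buffer(route_m, dist = 100)

# Convert back to long/lat so the buffers line up with the tweet coordinates
buf50_ll  <- st_transform(buf50, 4326)
buf100_ll <- st_transform(buf100, 4326)

# Larger buffers cover more area (values in square metres)
st_area(buf50)
st_area(buf100)
```

The resulting polygons could then be written to disk with `st_write()` and read back in the same way as the shapefiles above.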
It is not strictly necessary to use these more precise representations of the protest route, however. In fact, a rectangular bounding box can capture the same protestors at a limited cost in accuracy. To find the coordinates of a bounding box, we recommend using the open-source OpenStreetMap platform.
As shown below, by searching for a location in OpenStreetMap and selecting the “Export” option at the top of the window, we can view the coordinates of the upper-left and lower-right corners of the map displayed in the viewer window. The user can zoom in and out on this map to select an appropriate geographical area.
To generate a rectangular bounding box object from these four coordinates (shown in the image above), we combine the x1, y1, x2, and y2 values into a matrix of corner points. We can then convert this matrix into a SpatialPolygons object and assign the relevant CRS, the same one we assigned to our spatial points above.
# bounding box
x1 <- -77.0722
y1 <- 38.9145
x2 <- -76.9786
y2 <- 38.8660
coords <- matrix(c(x1, y1,
                   x1, y2,
                   x2, y2,
                   x2, y1,
                   x1, y1),
                 ncol = 2, byrow = TRUE)
P1 <- Polygon(coords)
bb <- SpatialPolygons(list(Polygons(list(P1), ID = "a")),
                      proj4string = CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"))
plot(bb)

We compare our bounding box shape to our route buffer shapes below.
# set a consistent CRS
# (note: this assigns the CRS label without reprojecting the coordinates;
# if a shapefile is stored in a different CRS, use spTransform() instead)
CRS.new <- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
proj4string(route50) <- CRS.new
proj4string(route100) <- CRS.new
proj4string(route1000) <- CRS.new
par(mfrow = c(2, 2))
plot(bb)
plot(route50, add = TRUE)
plot(bb)
plot(route100, add = TRUE)
plot(bb)
plot(route1000, add = TRUE)
plot(bb)

We can see that our rectangular bounding box covers a larger area than the march route shapes. If we think this bounding box is too large, we can always reduce its size by taking new coordinates from OpenStreetMap, converting them to a matrix, and generating a smaller spatial bounding box. For now, we will continue with the bounding box we have generated.
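If a smaller box is needed later, the matrix-building step can be wrapped in a small function so that new OpenStreetMap coordinates are easy to swap in. The following is a minimal sketch: `bbox_ring()` is a hypothetical helper name, and the coordinates of the smaller box are made-up values chosen only to sit inside the box used in the text.

```r
# Hypothetical helper: build the closed ring of corner coordinates
# (first row repeated as the last) from two opposite corners.
bbox_ring <- function(x1, y1, x2, y2) {
  matrix(c(x1, y1,
           x1, y2,
           x2, y2,
           x2, y1,
           x1, y1),
         ncol = 2, byrow = TRUE,
         dimnames = list(NULL, c("lon", "lat")))
}

# The box used in the text
ring_full <- bbox_ring(-77.0722, 38.9145, -76.9786, 38.8660)

# A smaller, made-up box inside it (illustrative values only)
ring_small <- bbox_ring(-77.0500, 38.9000, -77.0000, 38.8800)

ring_small
```

Either matrix can be passed to `Polygon()` and wrapped in `SpatialPolygons()` exactly as in the bounding box code earlier in this section.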
# subset the points to those falling within each shape and plot them in red
par(mfrow = c(2, 2))
plot(route50)
pts_subset <- points[route50, ]
plot(pts_subset, add = TRUE, col = "red")
plot(route100)
pts_subset <- points[route100, ]
plot(pts_subset, add = TRUE, col = "red")
plot(route1000)
pts_subset <- points[route1000, ]
plot(pts_subset, add = TRUE, col = "red")
plot(bb)
pts_subset <- points[bb, ]
plot(pts_subset, add = TRUE, col = "red")

# points within the bounding box (black) and within the 50m buffer (red)
plot(bb, axes = TRUE)
pts_subset <- points[bb, ]
plot(pts_subset, add = TRUE)
plot(route50, add = TRUE)
pts_subset <- points[route50, ]
plot(pts_subset, add = TRUE, col = "red")

# points within the bounding box (black) and within the 100m buffer (red)
plot(bb, axes = TRUE)
pts_subset <- points[bb, ]
plot(pts_subset, add = TRUE)
plot(route100, add = TRUE)
pts_subset <- points[route100, ]
plot(pts_subset, add = TRUE, col = "red")

# points within the bounding box (black) and within the 1000m buffer (red)
plot(bb, axes = TRUE)
pts_subset <- points[bb, ]
plot(pts_subset, add = TRUE)
plot(route1000, add = TRUE)
pts_subset <- points[route1000, ]
plot(pts_subset, add = TRUE, col = "red")

One of the challenges with Twitter data is that it is unclear whether someone who tweets about a protest actually participates in it. Information on the geo-location of users allows us to assess whether or not a user tweeted from within the protest march.
Using open-source GIS software in this way, we can locate individuals within the route of a protest march, which provides a more confident measure of participation.
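For the rectangular bounding box specifically, membership can also be checked with plain comparisons, which is a useful cross-check on the spatial subsetting. The sketch below is a base-R illustration under stated assumptions: `in_bbox()` is a hypothetical helper, the three points are made up rather than drawn from the dataset, and the approach works only for rectangles, not for the irregular route buffers.

```r
# Hypothetical helper: TRUE when a point lies inside (or on the edge of)
# the rectangle defined by two opposite corners.
in_bbox <- function(lon, lat, x1, y1, x2, y2) {
  lon >= min(x1, x2) & lon <= max(x1, x2) &
    lat >= min(y1, y2) & lat <= max(y1, y2)
}

# Made-up points for illustration (not from the dataset):
# one inside the box from the text, two well outside it
toy <- data.frame(
  lon = c(-77.0365, -76.6122, -0.1276),
  lat = c(38.8977, 39.2904, 51.5072)
)

toy$in_march_area <- in_bbox(toy$lon, toy$lat,
                             x1 = -77.0722, y1 = 38.9145,
                             x2 = -76.9786, y2 = 38.8660)

toy
sum(toy$in_march_area)  # number of points flagged as inside the box
```

A logical column like this can then serve as a simple participation indicator for each geolocated tweet.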
Once we have located our protestor-users, the estimation of their ideological position (based on their follow network) is straightforward using the tweetscores package by Pablo Barberá. We will not estimate the ideologies of our users above as they have been anonymized. But you can certainly look at your own: simply change the user name to your own Twitter username.
Note: you will also need the authentication token you created above to download the follow network of Twitter users. For more information, follow the steps outlined by Barberá here.
This is only the beginning of what we can do with Twitter data. The code below uses the tweetbotornot2 package by Michael Kearney, the author of the rtweet package, and predicts how many accounts in our Black Lives Matter tweets dataset are likely bots.
We plot a histogram of predicted likely bot accounts below, along with a short selection of some of the tweets.
library(remotes) # install remotes package if necessary
library(tweetbotornot2) # install from github if necessary
bots_p <- predict_bot(blm_tweets)

## [20:11:22] WARNING: amalgamation/../src/learner.cc:790: Loading model from XGBoost < 1.0.0, consider saving it again for improved compatibility

| screen_name | text |
|---|---|
| heyman_bot | #BLM |
| PortsideOrg | Amy Sherald Directs Her Breonna Taylor Painting Toward Justice #BreonnaTaylor #Louisville #policekillings #DefundthePolice #police #AfricanAmericanart #BLM #BlackLivesMatter https://t.co/HXUtAerPeS |
| culturereviewed | When Wits black students were fighting for the doors of learning to be open as the Freedom Charter promises, police responded with violence. Those are the “black crowd control” techniques they know. Nothing else #WitsProtest #BlackLivesMatter #CReview |
| say_the_names | Willie McCoy #BlackLivesMatter |
| say_the_names | Michael Dean #BlackLivesMatter |
| BLMProtestBot | If you’re reading this, remember that #BlackLivesMatter |
| say_the_names | John Crawford III #BlackLivesMatter |
| tbasharks | Wed Jul 29 2020 Portland, Oregon - Independent journalist arrested WATCH: https://t.co/3PUJqzKHfX #PortlandOregon #PPD #blacklivesmatter #blm #defundthepolice #abolishthepolice |
| tbasharks | Sat May 30 2020 Austin, Texas - Police shoot non-violent protester in the head WATCH: https://t.co/tMCKqCxjkp #AustinTexas #APD #blacklivesmatter #blm #defundthepolice #abolishthepolice |
| TheLivesThatMtr | Say their name. Derrick Lee Hunt, 2015-08-07 #BlackLivesMatter. |